Combining length distribution model with decision tree in prosodic phrase prediction
Authors
Abstract
In Text-to-Speech (TTS) systems, prosodic phrase prediction is important for the naturalness and intelligibility of synthesized speech. Statistical methods such as dynamic programming (DP), decision trees (DT), and maximum entropy (ME) have been applied to the task, and features based on syntactic and lexical information are widely used. However, the predicted prosodic phrases often have unrealistic lengths because the phrase length distribution is not modeled. This paper proposes a novel algorithm that incorporates a length distribution model into prosodic phrase prediction. Rather than directly using phrase length as a feature of the DT or ME model, the algorithm exploits the correlation between phrase length and the probability given by a decision tree. Experiments show that the proposed algorithm improves recall and precision by 16.37% and 14.05% relative, respectively.
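The abstract does not spell out how the length distribution and the decision-tree output are combined, so the following is only an illustrative sketch of one plausible combination, not the paper's algorithm. It assumes the tree yields a break probability after each word and that phrase lengths follow a known discrete distribution, then searches with dynamic programming for the segmentation that maximizes the sum of both log-scores. The function name `segment`, the parameter `max_len`, and the toy numbers are all hypothetical.

```python
import math


def segment(break_probs, length_prob, max_len=10):
    """Return 0-based indices of words that are followed by a phrase break.

    break_probs[i]: assumed decision-tree probability of a break after word i.
    length_prob[k]: assumed probability that a phrase is k words long.
    The last word always ends a phrase, so its break probability is ignored.
    """
    n = len(break_probs)
    eps = 1e-12  # guards against log(0)
    log_b = [math.log(max(p, eps)) for p in break_probs]
    log_nb = [math.log(max(1.0 - p, eps)) for p in break_probs]

    # nb_prefix[k] = sum of log non-break scores for words 0..k-1
    nb_prefix = [0.0]
    for v in log_nb:
        nb_prefix.append(nb_prefix[-1] + v)

    best = [-math.inf] * (n + 1)  # best[e]: best score for the first e words
    back = [0] * (n + 1)          # back[e]: start of the last phrase ending at e
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            lp = length_prob.get(end - start, 0.0)
            if lp <= 0.0 or best[start] == -math.inf:
                continue
            # Words start..end-2 are non-breaks; word end-1 carries the break.
            inner = nb_prefix[end - 1] - nb_prefix[start]
            boundary = 0.0 if end == n else log_b[end - 1]
            score = best[start] + inner + boundary + math.log(lp)
            if score > best[end]:
                best[end], back[end] = score, start

    if best[n] == -math.inf:
        raise ValueError("no segmentation is possible under length_prob")

    breaks, pos = [], n
    while pos > 0:
        breaks.append(pos - 1)
        pos = back[pos]
    return breaks[::-1]


# Toy values (hypothetical): thresholding the tree at 0.5 would break after
# words 2 and 3, which produces a one-word phrase.
probs = [0.1, 0.1, 0.7, 0.6, 0.1, 0.1]
lengths = {1: 0.05, 2: 0.15, 3: 0.6, 4: 0.15, 5: 0.05}
print(segment(probs, lengths))
```

With these toy values the search returns `[2, 5]`, that is, two three-word phrases: the length model suppresses the break after word 3 that the tree alone would have proposed. Working in log space keeps the two scores additive and avoids underflow on long sentences.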
Similar resources
A Hierarchical Stochastic Model for Automatic Prediction of Prosodic Boundary Location
Prosodic phrase structure provides important information for the understanding and naturalness of synthetic speech, and a good model of prosodic phrases has applications in both speech synthesis and speech understanding. This work describes a statistical model of an embedded hierarchy of prosodic phrase structure, motivated by results in linguistic theory. Each level of the hierarchy is modeled...
A Grammar Based Approach to Style Specific Phrase Prediction
We present an approach to style specific phrasing for Text-to-Speech (TTS) systems. We formulate the problem of phrase break prediction (or phrasing) as generation of a sequence of breaks (B) and non-breaks (NB) after each word in a sentence. We use prosodic breaks in speech data to build shallow parses over corresponding text. We then learn a grammar that can predict these shallow prosodic pars...
A new prosodic phrasing model for Indian language Telugu
Prosodic phrasing is an important and more difficult problem for Indian languages, as the Indian language scripts use very little or no punctuation. This paper reports a preliminary attempt at data-driven modeling of prosodic phrase boundary prediction for the Indian language Telugu. In an effort to identify meaningful features that affect prosodic phrasing, a new feature, namely morpheme ...
Prosody prediction for speech synthesis using transformational rule-based learning
Prediction of symbolic prosodic labels (pitch accents and phrase structure) is an important step in generating natural synthetic speech. This paper investigates a new automatically trainable procedure for combined accent and phrase prediction based on transformational rule-based learning. Experimental results on a radio news corpus show that accent prediction benefits from phrase structure, but ...
Decision Tree based Duration Prediction in Mandarin TTS System
This paper reports the methodology and results of decision tree based duration prediction for a Mandarin text-to-speech system developed by the Fujitsu Laboratories. Syllable initials and finals are the basic units in this duration study. Factors influencing finals duration such as phrase boundary and phone context are discussed in detail. Experiments indicate that it is the most important dete...